Abstract

We derive improved bounds on the error and erasure rate for spherical codes and for binary linear
codes under Forney's erasure/list decoding scheme and prove some related results.

1 Introduction

The subject of error bounds for various decoding schemes was a central topic of information theory in
its first decades. With the success of turbo codes and other iterative decoding schemes, this subject has
again attracted sustained attention over the last decade. In the early days the major effort in deriving
error bounds went into establishing the best attainable error exponents (for instance, Shannon's reliability 
function of channels). This approach is reflected in most textbooks on information theory that deal with 
this subject (UQOlEIlElE]]. Lately the attention has shifted from considering average properties of code 
ensembles to bounding the error probability of decoding of a particular code whose distance distribution is 
known or can be estimated. Focusing on a particular code instead of an ensemble of codes makes it possible 
to analyze the error probability by a geometric approach rather than Chernoff bounds. These studies gained 
momentum after the influential research of G. Poltyrev in [13], [14]; see [16] and references therein.

It is interesting to note that Shannon [17] also relied on a geometric derivation in his paper on the
error bounds for spherical codes and the Gaussian channel. The starting point of the present research was
an attempt to derive Shannon's results via the distance distribution of the code (recall that about the
original derivation the author wrote: "It might be said that the algebra involved is in several places
unusually tedious"). It turns out that in this way the results of [17] can be obtained by a simpler, more
intuitive argument. To add a new element to this study, we consider a version of Forney's erasure/list
decoding scheme [7], [8]. To define it, let $C$ be a code in a metric (observation) space $X$ with the
metric $d(\cdot,\cdot)$ and let $t \ge 0$. The decoding function $\psi_t$ is defined as follows:
$\psi_t(y) = x$ if for all code vectors $x' \ne x$ the distance $d(x', y) - d(x, y) > 2t$. For all other
points in $X$ the decoding result is undefined and will be called erasure below.
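The decoding rule just defined can be sketched in a few lines. The following is only an illustration for the Hamming metric, not part of the derivation: the function names and the toy repetition code are assumptions introduced here, and the strict inequality is taken exactly as written in the definition.

```python
from typing import Optional, Sequence, Tuple

Word = Tuple[int, ...]


def hamming(a: Word, b: Word) -> int:
    """Hamming distance between two words of equal length."""
    return sum(u != v for u, v in zip(a, b))


def erasure_decode(code: Sequence[Word], y: Word, t: int) -> Optional[Word]:
    """Return x if d(x', y) - d(x, y) > 2t for every other code vector x',
    and None (an erasure) otherwise."""
    # For t >= 0 only a code vector nearest to y can satisfy the condition,
    # so it is enough to compare the two smallest distances.
    ranked = sorted(code, key=lambda c: hamming(c, y))
    best = ranked[0]
    if len(ranked) == 1:
        return best
    gap = hamming(ranked[1], y) - hamming(best, y)
    return best if gap > 2 * t else None


# Hypothetical toy example: the length-5 repetition code.
code = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]
y = (0, 0, 0, 0, 1)                  # distances 1 and 4, so the gap is 3
print(erasure_decode(code, y, 1))    # gap 3 > 2: decoded to the all-zero vector
print(erasure_decode(code, y, 2))    # gap 3 <= 4: erasure (None)
```

Increasing $t$ shrinks the decoding regions, trading undetected errors for erasures, which is the trade-off quantified by the exponents studied below.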

We will be interested in the best attainable exponents of error and erasure probabilities, denoted $E_e$ and
$E_x$ below. Error bounds for this decoding for general discrete memoryless channels were derived in [7], [8].
In particular, they imply bounds on $E_e$ and $E_x$ for unrestricted codes in the Hamming space used over a
binary symmetric channel. The case of linear codes was addressed by Blokh and Zyablov [3]. Error bounds
for this decoding method in the case of spherical codes are not available in the literature.

The text is organized as follows. In Sect. 2 we address the technically easier and more familiar case of
binary linear codes. The main goal of this part is to develop geometric intuition in a more familiar situation
and then to rely on it in the more difficult case of spherical codes. However, as a byproduct, we obtain
an improvement of the bounds of [3] on $E_e$ and $E_x$. Moreover, the method we use is arguably easier to
understand than the results in [3].

In Sect. 3 we consider the case of bounded distance decoding and some other related questions.

In the second part we study spherical codes. For the Gaussian channel we obtain a pair of bounds that
specifies the trade-off between the error and erasure events. For $\tau = 0$ the bounds reduce to Shannon's lower
bound on the error exponent of maximum likelihood decoding. In our calculations we rely on the distance
distribution of codes. Note that Shannon's derivation [17], although geometric in nature, takes a somewhat
different route, performing averaging of the error probability over the choice of codes. This method is not
the best known for low noise because average codes contain small distances, so expurgation of the code
ensemble is needed to obtain a good bound for low rates. In contrast, we begin with choosing codes with
large minimum distance and obtain the complete result by a single argument. Since we operate in terms of
the distance distribution, we will obtain some new insights into the decoding geometry of spherical codes
in the course of our derivation. We also outline a derivation of Shannon's error bounds [17] by an approach
which is arguably simpler than both the original proof and Gallager's proof in [10]. The proof method
considered exhibits a close analogy between spherical codes and codes in $\{0,1\}^n$ if one makes allowance
for some peculiarities of discrete geometry.

We also derive error bounds for bounded distance decoding of spherical codes. This problem was 
mentioned in [20], however the focus of that paper is on different questions. In particular, we address the 
question of the probability of undetected error with spherical codes, in the sense specified in the main text, 
and establish the asymptotic behavior of this quantity. 

2 The binary case 

Let $X = \{0,1\}^n$ be the binary Hamming space with distance $d(\cdot,\cdot)$. We consider linear codes
$C \subset X$ of rate $R = n^{-1}\log_2|C|$ used over a binary symmetric channel with crossover probability
$p \in (0, 1/2)$.

For a code $C \subset X$ consider a decoding mapping $\psi_t : X \to C$ defined as follows: $\psi_t(x) = c$
if for all code vectors $c' \ne c$ the distance $d(c',x) - d(c,x) > 2t$ for some nonnegative integer $t = \tau n$.
For all other points in $X$ the decoding result is undefined, and will be called erasure below. For the case of
complete decoding we write $\psi$ instead of $\psi_0$.

Let us introduce notation. Denote by $A_w$, $w = 0, 1, \dots, n$ the weight distribution of $C$. For a code of
minimum distance $d$ we have $A_0 = 1$, $A_1 = \dots = A_{d-1} = 0$. Let us introduce the weight profile of the
code: for $\omega = w/n$, $w = 0, 1, \dots, n$ let

$$\alpha(\omega) = n^{-1}\log_2 A_{\omega n},$$

where $\log 0 = -\infty$. Let $A^0_0 = 1$, $A^0_w = \lfloor \binom{n}{w} 2^{-n(1-R)} \rfloor$, $w = 1, \dots, n$, and let

$$T(x,y) = -x\log_2 y - (1-x)\log_2(1-y),$$

where $h(x) := T(x,x)$, $D(x\|y) := T(x,y) - h(x)$. Throughout the rest of the text $\delta_{GV}(R) = h^{-1}(1-R)$ is the
relative Gilbert-Varshamov (GV) distance and $d = d_{GV}$ is $\delta_{GV}(R)n$ rounded to an integer. Let $E_0(R,p)$ be the
Gallager bound on the reliability function of the channel [9, pp. 34-36]:

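$$E_0(R,p) = \begin{cases}
-\delta_{GV}(R)\log_2 2\sqrt{p(1-p)}, & 0 \le R \le R_e, \quad \text{(a)}\\
D(\rho_0\|p) + R_c - R, & R_e \le R \le R_c, \quad \text{(b)}\\
D(\delta_{GV}(R)\|p), & R_c \le R \le 1-h(p), \quad \text{(c)}
\end{cases} \tag{1}$$

$$\rho_0 = \frac{\sqrt{p}}{\sqrt{p}+\sqrt{1-p}}, \qquad \omega_0 = 2\rho_0(1-\rho_0),$$

$$R_e = 1 - h(\omega_0), \qquad R_c = 1 - h(\rho_0).$$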
Denote by $S_r(0)$ a ball of radius $r$ in $X$ with center at $0$ and by

$$p^k_{i,j} = |\{z \in X : d(z,x) = i,\ d(z,y) = j;\ d(x,y) = k\}|$$

the number of triangles in $X$ with a fixed side of length $k$. Let $\nu = \log_2((1-p)/p)$.

For unrestricted codes various lower bounds on the exponents $E_e, E_x$ were given in Forney [8]. For
linear binary codes the following theorem was proved by Blokh and Zyablov.

Theorem 1 [3] For $0 \le R \le R_c$

$$E_e(R,p,\tau) \ge E_0(R,p) + \nu\tau, \tag{2}$$
$$E_x(R,p,\tau) \ge E_0(R,p) - \nu\tau. \tag{3}$$

For $R \ge R_c$

$$E_e(R,p,\tau) \ge E_0(R,p) + 2\tau D'_\delta(\delta\|p)\big|_{\delta=\delta_{GV}(R)}, \tag{4}$$
$$E_x(R,p,\tau) \ge E_0(R,p) - 2\tau D'_\delta(\delta\|p)\big|_{\delta=\delta_{GV}(R)}. \tag{5}$$

Note that the case $\tau = 0$ corresponds to maximum likelihood decoding, and the bound on $E_e$ turns into
$E_0$. The erasure rate in this case is of course zero, though (3), (5) give a positive value, because by the nature of
the argument the erasure probability $P_x$ is estimated by the sum $P_e + P_x$.

Remark 1: Note also that by (5), the exponent $E_x = 0$ for rates in the range close to the channel capacity.
In this range the value of the undetected error exponent $E_e$ can be claimed arbitrarily large if we modify the
decoding function to claim an erasure for all transmissions. Thus, in effect Theorem 1 contains a nontrivial
claim only for those values of the code rate $R$ for which $E_x > 0$, i.e., for which the right-hand side of the
bound on $E_x$ is positive. Of course, even when $E_x = 0$, i.e. decoding results in erasures in almost all
transmissions, it is still useful to know how often we will run into an undetected error.

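The aim of this section is to derive lower bounds on the exponents which are better than the estimates
(2)-(5) for most values of $R > 0$. To state the results we need the following definitions: $\mu = p(1-p)$,

$$\rho_0^{\pm} = \frac{\sqrt{\mu + \tau^2(1-2p)^2} - p(1 \pm 2\tau) \pm \tau}{1-2p},$$

$$\omega_0 = 2\rho_0^{\pm}(1-\rho_0^{\pm}) \mp 2\tau(1-2\rho_0^{\pm}) = \frac{2\bigl(\sqrt{\mu + \tau^2(1-4\mu)} - 2\mu\bigr)}{1-4\mu},$$

$$M_{\pm} = \begin{cases}
-\delta_{GV}(R)\Bigl(h\bigl(\tfrac12 \pm \tfrac{\tau}{\delta_{GV}(R)}\bigr) + \tfrac12\log_2\mu\Bigr) \pm \nu\tau, & 0 \le R \le 1-h(\omega_0), \quad \text{(a)}\\
D(\rho_0^{\pm}\|p) + 1 - R - h(\rho_0^{\pm} \mp 2\tau), & 1-h(\omega_0) \le R \le 1-h(\rho_0^{\pm} \mp 2\tau), \quad \text{(b)}\\
D(\delta_{GV}(R) \pm 2\tau\|p), & R \ge 1-h(\rho_0^{\pm} \mp 2\tau). \quad \text{(c)}
\end{cases}$$

Figure 1: Error bounds from Theorems 1 and 2 ($p = 0.07$, $\tau = 0.03$). In each pair the
better bound is from Thm. 2. The dashed line is the function $E_0(R,p)$.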

We then have the following result whose proof is given in the appendix. 

Theorem 2 Let $R \ge 1 - h(0.5 - \tau)$, then the exponent of the undetected error is bounded below as

$$E_e(R,p,\tau) \ge M_+. \tag{6}$$

Let $R \ge 0$, $\tau < p/2$, then the erasure exponent is bounded below as

$$E_x(R,p,\tau) \ge M_-. \tag{7}$$

Remark 1 applies to this theorem as well: the claim of the theorem is nontrivial for code rates below
$1 - h(p + 2\tau)$.

For $\tau = 0$ the bounds also reduce to $E_0(R,p)$, as expected. However, for $\tau > 0$ they are strictly greater than
the bounds of Theorem 1. For instance, for the case (c) this can be proved using the fact that $D(\delta\|p)$ is a
$\cup$-convex increasing function of $\delta$ for $\delta > p$:

$$M_+ = D(\delta_{GV}(R) + 2\tau\|p) > D(\delta_{GV}(R)\|p) + 2\tau D'_\delta(\delta_{GV}(R)\|p).$$

It is also easy to establish similar inequalities in the other cases. Typical behavior of the bounds on
$E(R,p,\tau)$ from Theorems 1 and 2 is shown in Fig. 1. These theorems and the other results in the binary
case extend in a standard way to binary-input output-symmetric discrete memoryless channels and to the
$q$-ary symmetric channel, $q > 2$.

Remark 2: The conditions $R \ge 1 - h(1/2 - \tau)$ and $\tau < p/2$ seem to make Theorem 2 sound more
restrictive than Theorem 1. It is possible to remove these conditions and prove somewhat weaker bounds
which will still improve upon Theorem 1. However, the first of the two conditions for small $\tau$ is not a
substantial restriction of the range of code rates: for instance, for $\tau = 0.03$ the bound on $E_e$ is valid for all
code rates $R > 0.0025$. Furthermore, it is often the case that the bounds of Theorem 1 are void while Theorem 2
claims nontrivial results. For instance, for $(p,\tau) = (0.2, 0.09)$ the bound (3) and hence the rest of Theorem
1 is trivial while Theorem 2 gives nontrivial exponential error bounds for small values of the code rate.

Remark 3: As indicated above, Theorem 1 can be obtained by a small modification of the proof of our
result. Generally, Theorem 1 claims results weaker than those in (6)-(7) because the authors of [3] in their
derivation relied on a suboptimal decision region.

Remark 4: We can add some details on typical error events in the course of decoding. For instance,
consider the error-only case. Let $\rho_{typ}$ be the relative weight of error vectors that lead to a decoding error, and
let $\omega_{typ}$ be the relative weight of code vectors obtained as a result of incorrect decoding. From the proof it is
clear that for the case (a), $\rho_{typ} = (1-\delta_{GV})p + \delta_{GV}/2 + \tau$, $\omega_{typ} = \delta_{GV}$. For the case (b), $\rho_{typ} = \rho_0$, $\omega_{typ} = \omega_0$.
Finally, for the case (c), $\rho_{typ} = \delta_{GV}$, $\omega_{typ} = 2\delta_{GV}(1-\delta_{GV}) + 2\tau(1-2\delta_{GV})$. A more detailed discussion of
these results for $\tau = 0$ is provided by

Remark 5: Note an alternative expression for the case (b) of $M_\pm$:

$$M_{\pm} = 1 - R - h(\omega_0) - \omega_0\, h\Bigl(\frac12 \pm \frac{\tau}{\omega_0}\Bigr) - \frac{\omega_0}{2}\log_2\mu \pm \nu\tau.$$

We stress that the dependence of the bound on $\tau$ is essentially nonlinear, contrary to the closed-form bounds
of Theorem 1.
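The bounds $M_\pm$ are easy to evaluate numerically. The sketch below is an illustration under stated assumptions rather than the authors' computation: it reads the definitions as $\mu = p(1-p)$, $\rho_0^\pm = (\sqrt{\mu+\tau^2(1-2p)^2} - p(1\pm2\tau) \pm \tau)/(1-2p)$, $\omega_0 = 2(\sqrt{\mu+\tau^2(1-4\mu)}-2\mu)/(1-4\mu)$, takes the boundary between cases (b) and (c) at $R = 1-h(\rho_0^\pm \mp 2\tau)$ (the value at which the two branches agree), and returns zero where the case (c) argument drops below $p$. All function names are hypothetical.

```python
import math


def h(x: float) -> float:
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def D(x: float, p: float) -> float:
    """Divergence D(x||p)."""
    return -x * math.log2(p) - (1 - x) * math.log2(1 - p) - h(x)


def delta_gv(R: float) -> float:
    """Root of h(delta) = 1 - R in [0, 1/2], found by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(100):
        mid = (lo + hi) / 2
        if h(mid) < 1 - R:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def params(p: float, tau: float, sign: int):
    """Return (mu, rho_0^sign, omega_0); sign is +1 (error) or -1 (erasure)."""
    mu = p * (1 - p)
    s = math.sqrt(mu + tau ** 2 * (1 - 2 * p) ** 2)
    rho = (s - p * (1 + sign * 2 * tau) + sign * tau) / (1 - 2 * p)
    omega0 = 2 * (s - 2 * mu) / (1 - 4 * mu)
    return mu, rho, omega0


def M(R: float, p: float, tau: float, sign: int) -> float:
    """M_+ (sign=+1) or M_- (sign=-1), cases (a), (b), (c)."""
    mu, rho, omega0 = params(p, tau, sign)
    nu = math.log2((1 - p) / p)
    d = delta_gv(R)
    if R <= 1 - h(omega0):                                   # case (a)
        return -d * (h(0.5 + tau / d) + 0.5 * math.log2(mu)) + sign * nu * tau
    if R <= 1 - h(rho - sign * 2 * tau):                     # case (b)
        return D(rho, p) + 1 - R - h(rho - sign * 2 * tau)
    x = d + sign * 2 * tau                                   # case (c)
    return D(x, p) if x > p else 0.0   # assumed convention: trivial bound is 0


def M_b_alt(R: float, p: float, tau: float, sign: int) -> float:
    """Alternative expression for case (b) from Remark 5."""
    mu, _, omega0 = params(p, tau, sign)
    nu = math.log2((1 - p) / p)
    return (1 - R - h(omega0) - omega0 * h(0.5 + sign * tau / omega0)
            - 0.5 * omega0 * math.log2(mu) + sign * nu * tau)


p, tau = 0.1, 0.05
print(M(0.1, p, tau, +1), M_b_alt(0.1, p, tau, +1))   # two forms of case (b)
print(M(0.1, p, 0.0, +1))                             # tau = 0: Gallager bound
```

With these parameters the two forms of case (b) agree to machine precision, and at $R = 0.1$ the value of $M_+$ exceeds the Theorem 1 estimate $E_0 + \nu\tau$, in line with the discussion above.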

3 Related results: The binary case 

1. Let us address the question of error bounds for a specific code under max-likelihood decoding. Let $C$ be
a code with distance distribution $A_w = 2^{n\alpha(\omega)}$ ($\omega = w/n$, $0 \le w \le n$) and let

$$K(C) = \max_{1 \le w \le n} \frac{A_w}{\max(1, A^0_w)},$$

with $A^0_w$ the binomial weight spectrum of Sect. 2, and $\kappa(C) := n^{-1}\log_2 K(C)$. For simplicity only we
put $\tau = 0$. The following bound is straightforward.
Theorem 3

$$n^{-1}\log_2\frac{1}{P_{de}(C)} \ge \max\bigl(D,\ E_0(R,p) - \kappa(C)\bigr) - o(1), \tag{8}$$

where

$$D = -\max_{\omega > 0}\bigl(\alpha(\omega) + (\omega/2)\log_2(4\mu)\bigr).$$



PROOF. Denote by $P_e(w)$ the error probability under the condition that the decoded vector is $w$ away from
the transmitted one. Then

$$P_{de}(C) \le \sum_{w=1}^{n} A_w P_e(w) \le \sum_{w=1}^{n} A_w (4\mu)^{w/2}.$$

Taking logarithms and switching to exponents, we obtain the first part of the claim. The second part is
equally obvious because

$$P_{de}(C) \le \sum_{w=1}^{n} A_w P_e(w) \le 2^{-n(1-R)} K(C) \sum_{w=d}^{n} \binom{n}{w} P_e(w) \le 2^{n(\kappa(C) - E_0(R,p) - o(1))}.$$

Note that if $R_e \le R \le R + \kappa(C) \le R_c$, then $E_0(R,p)$ is given by a linear function with slope $-1$, and
we can write $E_0(R,p) - \kappa(C) = E_0(R + \kappa(C), p)$. The second part of the bound under the maximum in
(8) is the main result of [18, see Theorem 1 of that paper]. The above proof is a shorter way to obtain it.

Note also that for low $R$ the bound $D$ on the error rate of $C$ can be better (and is never worse) than the
second part of (8). This is due to the fact that in $D$ we maximize the product of the weight profile and the
pairwise error probability, while in the second bound the maximization of these two terms is separate.

2. Consider the decoding procedure $\psi_t$ under which $\psi_t(x) = c$ if $d(c,x) \le t$ and $\psi_t(x)$ is undefined if
such a code vector does not exist. In this case the calculation of the error exponent is cumbersome, and
depends on the relation between $p$ and $\tau$. One particular case is easy to analyze.

Proposition 4 Let $C$ be a linear binary code with weight distribution $A_i$, $i = 0, \dots, n$. Suppose that for
every $d \le w \le n$ the maximum term of the sum

$$\sum_{i,l:\ w-i+l \le t} \binom{w}{i}\binom{n-w}{l} p^{i+l}(1-p)^{n-i-l}$$

is attained for $l = 0$, $i = w - t$. Then $-n^{-1}\log P_{de}(C) \ge E_e(R,p,\tau) - o(1)$, where, with $\delta = \delta_{GV}(R)$,

$$E_e(R,p,\tau) = \begin{cases}
-\delta h(\tau/\delta) + T(\delta-\tau, p), & 0 \le R \le 1 - h(p + \tau(1-p)),\\
1 - R - h(\tau) - \tau\log_2(1-p), & 1 - h(p + \tau(1-p)) \le R.
\end{cases}$$

Proof (outline). We have

$$P_{de}(C) \le (t+1)^2 \sum_{w=d}^{n} A_w \binom{w}{t} p^{w-t}(1-p)^{n-w+t}.$$

In the sum on $w$ the summation term is maximized for $w \approx n(p + \tau(1-p))$. The exponent in question is
obtained by computing the logarithms and depends on the sign of $p + \tau(1-p) - \delta$. For $p + \tau(1-p) < \delta$ the
dominating term is the one with $w = d$. Upon simplification we obtain the first case of the claimed bound.
Otherwise the maximum is within the summation range. Taking logarithms, substituting $\omega = p + \tau(1-p)$
and simplifying, we obtain the second case.
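The second case of the bound can be made concrete with a short numerical check. The sketch below assumes the summation term has the form $\binom{n}{w}\binom{w}{t}p^{w-t}(1-p)^{n-w+t}$, i.e. a binomial weight spectrum with the constant factor $2^{-n(1-R)}$ dropped. The parameter values are arbitrary choices for illustration. Summed over all $w \ge t$, these terms equal $\binom{n}{t}(1-p)^t$, whose exponent $h(\tau) + \tau\log_2(1-p)$ accounts for the expression $1-R-h(\tau)-\tau\log_2(1-p)$.

```python
import math


def term(n: int, w: int, t: int, p: float) -> float:
    """binom(n, w) * binom(w, t) * p^(w-t) * (1-p)^(n-w+t)."""
    return math.comb(n, w) * math.comb(w, t) * p ** (w - t) * (1 - p) ** (n - w + t)


n, t, p = 200, 20, 0.1                  # illustrative values; tau = t/n = 0.1
terms = {w: term(n, w, t, p) for w in range(t, n + 1)}
total = sum(terms.values())
closed_form = math.comb(n, t) * (1 - p) ** t
w_star = max(terms, key=terms.get)      # weight of the largest summation term

print(total / closed_form)                      # ratio of the sum to the closed form
print(w_star, n * (p + (t / n) * (1 - p)))      # maximizer versus n(p + tau(1-p))
```

The identity follows from $\binom{n}{w}\binom{w}{t} = \binom{n}{t}\binom{n-t}{w-t}$ and the binomial theorem; the largest term sits at $w = t + p(n-t)$ up to rounding, which is the value $n(p+\tau(1-p))$ used in the proof outline.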

In particular, let t = 0, which corresponds to the case of pure error detection. Then E e (R,p, r) reduces 
to the well-known lower bound on the exponent of undetected error ATI . 



4 Spherical codes 

In this section we address the problem of error bounds for erasure decoding for the case of spherical codes.
We assume transmission over a Gaussian channel with signal-to-noise ratio $A$. Let $S^{n-1}(r, x)$ be the sphere
in $R^n$ of radius $r$ with center at $x$. We will write $S^{n-1}(r)$ for $S^{n-1}(r, 0)$.

Let $X = S^{n-1}(\sqrt{An})$ and let $y_1, y_2 \in X$ be two vectors. One way to measure the distance between
them is by the angle $\angle(y_1, y_2)$, and we will write $d(y_1, y_2) = \phi$ if this angle equals $\phi$. The distance of a
code $C \subset X$ is defined in the usual way as the minimum pairwise distance in $C$. For a given vector $x$ if
$y = x + z$ and $d(y, x) = \phi$, we will say that $z$ has weight $\phi$.

For a code $C \subset X$ let $M$ be its size, $R = n^{-1}\ln M$ its rate and $\theta = \theta(C)$ its distance. We also define
the distance distribution of $C$ as follows:

$$B(s,t) = M^{-1}|\{x, x' \in C : s \le d(x,x') < t\}|$$

and $B(s) = B(0,s)$, so that $M = \int_0^{\pi} dB(x)$. Given a family of codes, we call the function $b(x)$ its distance
profile if

$$b(x) = \lim_{n\to\infty} (1/n)\ln B(x-\epsilon, x+\epsilon),$$

assuming that the limit exists. Throughout this and the next section we use the notation $\theta_s = \theta_s(R) :=
\arcsin e^{-R}$.

The decoding mapping for $C$ is defined as follows: $\psi_\tau(y) = x$ if $d(y,x') - d(y,x) > 2\tau$ for all
$x' \in C$, $x' \ne x$. If such a code vector $x$ does not exist, decoding results in an erasure. Assume that the
transmitted vector $x$ is displaced by a noise vector $z$ whose coordinates are i.i.d. Gaussian random variables
with mean zero and unit variance. Let $E_e(R,A,\tau)$ and $E_x(R,A,\tau)$ be the best attainable exponents of the error
and erasure rate, respectively. When $\tau = 0$, this is the usual complete decoding, and $E_e$ is the reliability
function of the Gaussian channel. In this case we will omit $\tau$ from our notation and write $\psi$, $E(R,A)$. The
following lower bound on $E(R,A)$ is classical [17]: let $\theta = \theta_s(R)$, then $E(R,A) \ge E_0(\theta,A)$, where



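$$E_0(\theta,A) = \begin{cases}
\frac{A}{4}(1-\cos\theta), & \pi/2 \ge \theta \ge \theta_e, \quad \text{(a)}\\
\frac{A}{4}(1-\cos\theta_e) + \ln\frac{\sin\theta}{\sin\theta_e}, & \theta_e \ge \theta \ge \theta_c, \quad \text{(b)}\\
E_{sp}(\theta,A), & \theta_c \ge \theta \ge \operatorname{arccot}\sqrt{A}, \quad \text{(c)}
\end{cases} \tag{9}$$

where $\csc^2\theta_e = \frac12 + \frac12\sqrt{1 + \frac{A^2}{4}}$, $\csc^2\theta_c = \frac12 + \frac{A}{4} + \frac12\sqrt{1 + \frac{A^2}{4}}$,

$$E_{sp}(\phi,A) := \frac{A}{2} - \frac{\sqrt{A}}{2}\, g(\phi,A)\cos\phi - \ln\bigl(g(\phi,A)\sin\phi\bigr),$$

$$g(\phi,A) := \frac12\Bigl(\sqrt{A}\cos\phi + \sqrt{A\cos^2\phi + 4}\Bigr).$$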

This bound will follow as a special case of our derivation. 
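For orientation, Shannon's bound is simple to evaluate. The sketch below reads (9) as $(A/4)(1-\cos\theta)$ for $\theta \ge \theta_e$, $(A/4)(1-\cos\theta_e) + \ln(\sin\theta/\sin\theta_e)$ for $\theta_e \ge \theta \ge \theta_c$, and $E_{sp}(\theta,A)$ below $\theta_c$, with $\csc^2\theta_e = \frac12 + \frac12\sqrt{1+A^2/4}$ and $\csc^2\theta_c = \frac12 + \frac A4 + \frac12\sqrt{1+A^2/4}$. This reading of the formulas and all function names are assumptions of the illustration.

```python
import math


def g(phi: float, A: float) -> float:
    """g(phi, A) = (sqrt(A) cos(phi) + sqrt(A cos^2(phi) + 4)) / 2."""
    c = math.cos(phi)
    return 0.5 * (math.sqrt(A) * c + math.sqrt(A * c * c + 4))


def E_sp(phi: float, A: float) -> float:
    """Sphere-packing exponent E_sp(phi, A), natural logarithms."""
    gg = g(phi, A)
    return A / 2 - 0.5 * math.sqrt(A) * gg * math.cos(phi) - math.log(gg * math.sin(phi))


def theta_e(A: float) -> float:
    """Angle theta_e with csc^2 = 1/2 + sqrt(1 + A^2/4)/2."""
    return math.asin(1 / math.sqrt(0.5 + 0.5 * math.sqrt(1 + A * A / 4)))


def theta_c(A: float) -> float:
    """Critical angle theta_c with csc^2 = 1/2 + A/4 + sqrt(1 + A^2/4)/2."""
    return math.asin(1 / math.sqrt(0.5 + A / 4 + 0.5 * math.sqrt(1 + A * A / 4)))


def theta_s(R: float) -> float:
    """Code distance theta_s(R) = arcsin(exp(-R)), rate in nats."""
    return math.asin(math.exp(-R))


def E0_sph(theta: float, A: float) -> float:
    """Shannon's lower bound, branches (a), (b), (c)."""
    te, tc = theta_e(A), theta_c(A)
    if theta >= te:                                                    # (a)
        return A / 4 * (1 - math.cos(theta))
    if theta >= tc:                                                    # (b)
        return A / 4 * (1 - math.cos(te)) + math.log(math.sin(theta) / math.sin(te))
    return E_sp(theta, A)                                              # (c)


A = 4.0
for R in (0.1, 0.3, 0.5, 0.7):
    print(R, round(E0_sph(theta_s(R), A), 6))
```

The branches meet continuously at $\theta_c$, and $E_{sp}$ vanishes at $\theta = \operatorname{arccot}\sqrt{A}$, which corresponds to the capacity $\frac12\ln(1+A)$.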

The (Shannon) volume, or sphere packing bound [17] establishes the existence of codes of rate $R$ with
distance arbitrarily close to $\theta_s(R)$. It is also straightforward to prove that there exists a code $C$ of rate $R$
with distance $\theta_s$ and distance distribution

$$dB(\theta) \le p(n)\, e^{n(R + \ln\sin\theta)} \quad (\theta_s \le \theta \le \pi - \theta_s),$$

where $p(n)$ is some polynomial function. This distribution is induced by the (normalized) uniform measure
on $S^{n-1}$ and therefore plays the role analogous to that of the binomial distribution in the Hamming space.
The distance profile corresponding to it is $\beta(R,\theta) = R + \ln\sin\theta$. We will examine the behavior of the error
rate with decoding $\psi_\tau$ being applied to sequences of codes $C$ with these properties.

Below we track only one of the two cases, the error-only event, and state results for the erasure rate 
sparing the reader the detailed analysis. Our goal will be to establish the following theorem. 

Theorem 5 Let $R$ be the code rate, let $\theta_s = \theta_s(R)$ be the code distance and let $\tau \ge 0$. The exponent
$E_e(R,A,\tau)$ is bounded below by $M(R)$, where for $\pi/2 \ge \theta_s \ge \theta_1$

$$M(R) = \frac{A}{4}\bigl(1 - \cos(\theta_s + \tau)\bigr) - G(\theta_s,\tau), \tag{10}$$

for $\theta_1 \ge \theta_s \ge \theta_2$

$$M(R) = \frac{A}{4}\bigl(1 - \cos(\theta_1 + \tau)\bigr) + \ln\frac{\sin\theta_s}{\sin\theta_1} - G(\theta_1,\tau) \tag{11}$$

and for $\theta_2 \ge \theta_s$

$$M(R) = E_{sp}(\rho, A). \tag{12}$$

Here



G{4>,T) = \hx 



1 + 



,4 cos 2 ^ (sin 2 ^ - sin 2 ( | + r) ) 
cos 2 (|+r) 



. 2 

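(13)

$\theta(x)$ is the (real, positive-valued) function defined implicitly by the equation

$$\cot\theta = \frac{\cos^2 x\,\tan(\theta/2 + \tau)}{\cos(\theta + 2\tau) - \cos 2x}, \tag{14}$$

$\rho = \rho(R) \in [\theta_s, 2\theta_s]$ is the unique angle such that

$$R + \ln\sin\theta(\rho) + \frac12\ln\Bigl(1 - \frac{\tan^2(\theta(\rho)/2 + \tau)}{\tan^2\rho}\Bigr) = 0, \tag{15}$$

$\theta_1$ is the root of

$$\frac{d}{dx}\Bigl(\ln\sin x + \frac{A}{4}\cos(x + \tau) + G(x,\tau)\Bigr) = 0, \tag{16}$$

and $\theta_2 = \theta_s(R^*)$, where $R^*$ is the root of $\theta(\rho(R)) = \theta_1$.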

A lower bound on the exponent $E_x(R,A,\tau)$ is obtained on replacing $\tau$ by $-\tau$ throughout.

Although it is not immediately seen, for $\tau = 0$ we have $M(R) = E_0(R)$, so in this case the bounds
simplify significantly. For instance, $G(\phi, 0) = 0$, and the bound (10) reduces to (9a), the value $\rho$ equals
$\theta_s(R)$, the angle $\theta_1$ is simply $\theta_e$ of (9a)-(9b), and so on. We explain these and indicate further connections
with bound (9) in Remark 6 below. Note that though there seems to be no closed-form expression for the
exponents, it is easy to compute them for any given $A$, $\tau$. It helps to observe that on substituting $\theta(\rho)$ into
(15), this equation contains only one unknown, $\rho$. We show the behavior of the bounds in Fig. 2. Note that
$M(\rho) > 0$ for $0 < \rho < \operatorname{arccot}\sqrt{A}$. Note also that $G(\phi,\tau)$ is negative (and usually small), so on omitting it
from expressions (10), (11) we still obtain valid lower bounds.

The remaining part of this section is devoted to the proof of this theorem. We begin with some notation
and technical results. Let $\mathrm{Con}(x,\phi)$ denote the circular cone with apex at the origin, axis given by a vector
$x \in R^n$ and solid half-angle $\phi$. We write $f(n) \cong g(n)$ if $\lim_{n\to\infty} \frac1n \ln\frac{f(n)}{g(n)} = 0$.

We will need the following lemmas. 

Lemma 6 [17] Let $x \in X$ and $z$ a random Gaussian vector. Let $Q(\phi)$ be the probability that $x + z \notin
\mathrm{Con}(x,\phi)$. Then $Q(\phi) \cong e^{-nE_{sp}(\phi,A)}$ and $-dQ(\phi) \cong e^{-nE_{sp}(\phi,A)}d\phi$.

Lemma 7 [17] Let $T(\phi)$ be the area of the spherical cap on the sphere $S^{n-1}(r)$ cut out by a cone $\mathrm{Con}(x,\phi)$.
Then as $n \to \infty$

$$T(\phi) \sim \frac{2\pi^{(n-1)/2}\, r^{n-1}\sin^{n-1}\phi}{(n-1)\,\Gamma\bigl(\frac{n-1}{2}\bigr)\cos\phi}.$$

For the normalized area $\Omega(\phi) = T(\phi)/T(\pi)$ we have

$$\Omega(\phi) \cong (\sin\phi)^n.$$

Figure 2: Error bounds from Theorem 5 ($A = 4$, $\tau = 0.04 \approx 2.3°$). The dashed line
is the function $E_0(R,A)$.



Lemma 8 (e.g. [6, p.65]) (Laplace method) Let

$$h(\lambda) = \int_a^b e^{\lambda q(x)}dx \quad (-\infty \le a \le 0 < b \le \infty).$$

Suppose that the integral converges absolutely at least for sufficiently large $\lambda$.

(i) Suppose that the absolute maximum of $q(x)$ in $[a,b]$ is attained for $x = 0$, that $q''(x)$ exists and is
continuous in some neighborhood of $0$, and that $q''(0) < 0$. Then

$$h(\lambda) \sim e^{\lambda q(0)}\sqrt{\frac{-2\pi}{\lambda q''(0)}}.$$

(ii) Suppose that $a = 0$ and that the absolute maximum of $q(x)$ in $[a,b]$ is attained for $x = 0$. Then

$$h(\lambda) \sim -\frac{e^{\lambda q(0)}}{\lambda q'(0)}$$

provided that $q'(0) < 0$.

Shannon's approach to bounding the rate of error events is as follows. Let $\mathcal{E}$ denote one of the two
events: error, or error or erasure.

Lemma 9 [17] Let $z$ be the channel error vector. Then

$$P(\mathcal{E}, C) \le \min_{\rho}\,\bigl\{P(\mathcal{E} \mid w(z) \le \rho) + P(w(z) > \rho)\bigr\}.$$

Generally, the minimum is attained for different $\rho$ depending on the meaning of $\mathcal{E}$. Note that to obtain a
valid bound we do not have to optimize on $\rho$, taking an arbitrary value at our convenience. Below we always
assume that $\rho < 90°$.

Figure 3: Derivation of Lemma 10

Lemma 10 Let $x_1 \in C$ be transmitted and let $P_\theta(x_1 \to x_2)$ be the probability that decoding $\psi_\tau$ mistakes
$x_1$ for a fixed code vector $x_2$ with $d(x_1, x_2) = \theta$. Then $P_\theta(x_1 \to x_2) \cong F(\theta,\tau)$, where

$$F(\theta,\tau) := \int_{\theta/2+\tau}^{\rho} \Bigl(1 - \frac{\tan^2(\theta/2+\tau)}{\tan^2\phi}\Bigr)^{n/2} e^{-nE_{sp}(\phi,A)}\,d\phi.$$

PROOF. Let $z$ be the error vector with $w(z) = \phi$, $\|z\| = r$. Let us compute the fraction of such errors that
lead to a decoding error that outputs $x_2$. For this to happen it suffices that $\phi \ge \theta/2 + \tau$ and $\|x_2 - y\| \le
\|x_1 - y\|$, where $y = x_1 + z$ is the received vector. This fraction equals the normalized area of the spherical
cap cut out on the surface of $\mathrm{Con}(x_1,\phi)$ by the hyperplane perpendicular to $x_1$ and located at a distance $r$
from the origin. Taking in Fig. 3 the angle $\theta/2 + \tau$, we compute for the angle $\alpha$ of this cap

$$\sin^2\frac{\alpha}{2} = 1 - \Bigl(\frac{r\tan(\theta/2+\tau)}{r\tan\phi}\Bigr)^2 = 1 - \frac{\tan^2(\theta/2+\tau)}{\tan^2\phi}.$$

The normalized area of the cap in question is given by Lemma 7:

$$\Omega \cong (\sin\alpha/2)^n$$

and does not depend on the distance $r$ from the origin. Hence we may integrate $r$ out and obtain for the
differential probability

$$P(x_1 \to x_2 \mid \phi \le w(z) \le \phi + d\phi) = -\Bigl(1 - \frac{\tan^2(\theta/2+\tau)}{\tan^2\phi}\Bigr)^{n/2} dQ(\phi).$$

Now the claim of the lemma is obtained by integrating on $\phi$ from $\theta/2 + \tau$ (because errors of smaller
weight cannot lead to decoding error) to $\rho$ (because errors of greater weight are assumed to always lead to a
decoding error) and substituting $dQ(\phi)$ from Lemma 6.

Putting the pieces together, we obtain the following bound on the probability of error for spherical codes
under $\psi_\tau$.

Theorem 11 Let $C$ be a code of rate $R$ and distance $\theta(C)$ with distance profile $b(\theta)$. Then for any
$\rho \in (\theta(C)/2, \pi/2)$

$$P(\mathcal{E}, C) \le \int_{\theta(C)}^{2\rho - 2\tau} e^{nb(\theta)} F(\theta,\tau)\,d\theta + Q(\rho). \tag{17}$$

PROOF. Follows on applying Lemmas 9 and 10 and the union bound.

Now we are ready to complete the proof of Theorem 5. First let us find the asymptotic behavior of
$F(\theta,\tau)$. We have $F(\theta,\tau) \cong \int e^{nq(\phi)}d\phi$, where

$$q(\phi) = \frac12\ln\Bigl(1 - \frac{\tan^2(\theta/2+\tau)}{\tan^2\phi}\Bigr) - E_{sp}(\phi,A).$$

From the equation $q'(\phi) = 0$ we find that the maximum of the integrand is attained for $\phi_0(\tau)$ defined by

$$\sin^2\phi_0 = \frac{4 + A\sin^2(\theta + 2\tau)}{2\bigl(2 + A + A\cos(\theta + 2\tau)\bigr)}. \tag{18}$$

The asymptotic value of the integral is obtained by Lemma 8 and depends on the location of $\phi_0$ with respect
to the integration limits. First, it is easy to see that $\phi_0 > \theta/2 + \tau$ for any $0 < A < \infty$, $0 \le \theta \le \pi - 2\tau$.
Indeed, it suffices to show that $\sin^2\phi_0 > \sin^2(\theta/2 + \tau)$. Therefore, compute

$$2\bigl(2 + A + A\cos(\theta+2\tau)\bigr)\Bigl(\sin^2\phi_0 - \sin^2\bigl(\tfrac{\theta}{2}+\tau\bigr)\Bigr)$$
$$= 4 + A\sin^2(\theta+2\tau) - 2\sin^2\bigl(\tfrac{\theta}{2}+\tau\bigr)\bigl(2 + A + A\cos(\theta+2\tau)\bigr)$$
$$= 4\Bigl(1 - \sin^2\bigl(\tfrac{\theta}{2}+\tau\bigr)\Bigr) > 0.$$
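The location of the maximum can be confirmed numerically. The sketch below assumes $q(\phi) = \frac12\ln\bigl(1 - \tan^2(\theta/2+\tau)/\tan^2\phi\bigr) - E_{sp}(\phi,A)$ with $E_{sp}$ as defined before Theorem 5, evaluates $\phi_0$ from (18), and compares it with a brute-force grid search on $(\theta/2+\tau, \pi/2)$. The parameter values and function names are arbitrary illustrative choices.

```python
import math


def E_sp(phi: float, A: float) -> float:
    """Sphere-packing exponent E_sp(phi, A)."""
    c = math.cos(phi)
    g = 0.5 * (math.sqrt(A) * c + math.sqrt(A * c * c + 4))
    return A / 2 - 0.5 * math.sqrt(A) * g * c - math.log(g * math.sin(phi))


def q(phi: float, theta: float, tau: float, A: float) -> float:
    """Exponent of the integrand of F(theta, tau); needs phi > theta/2 + tau."""
    ratio = math.tan(theta / 2 + tau) / math.tan(phi)
    return 0.5 * math.log(1 - ratio * ratio) - E_sp(phi, A)


def phi0(theta: float, tau: float, A: float) -> float:
    """Stationary point given by (18)."""
    x = theta + 2 * tau
    s2 = (4 + A * math.sin(x) ** 2) / (2 * (2 + A + A * math.cos(x)))
    return math.asin(math.sqrt(s2))


A, tau = 4.0, 0.04
theta = math.pi / 2 - 2 * tau          # chosen so that theta/2 + tau = pi/4
lower = theta / 2 + tau
p0 = phi0(theta, tau, A)

# Brute-force search for the maximizer of q on the open interval (lower, pi/2).
step = (math.pi / 2 - lower) / 2000
grid = [lower + k * step for k in range(1, 2000)]
best = max(grid, key=lambda phi: q(phi, theta, tau, A))

print(p0, best)                        # formula (18) versus grid maximizer
```

For these values $\sin^2\phi_0 = 2/3$, the grid maximizer falls within one grid step of $\phi_0$, and $\phi_0$ lies strictly above the lower integration limit, as shown in the computation above.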

It remains to examine the location of $\phi_0$ with respect to the upper limit of integration, $\rho$. We have the two
following cases.

1. $\phi_0 < \rho$. Then by Lemma 8(i) the behavior of the integral is determined by $\phi$ in the neighborhood of
$\phi_0$. We obtain

$$n^{-1}\ln F(\theta,\tau) \sim \frac12\ln\Bigl(1 - \frac{\tan^2(\theta/2+\tau)}{\tan^2\phi_0}\Bigr) - E_{sp}(\phi_0, A).$$

Next let us proceed to computing the asymptotic expression for the outer integral in (17). Denoting the
integrand by $D$, substituting the value of $\phi_0$ and taking $b(\theta) = \beta(R,\theta)$, after all simplifications, we arrive
at the expression

$$-n^{-1}\ln D \sim \frac{A}{4}\bigl(1 - \cos(\theta + \tau)\bigr) - \beta(R,\theta) - G(\theta,\tau).$$

Now invoke again Lemma 8. The main term of the integral depends on the relative location of the maximizing
value of $\theta$, denoted by $\theta_1$, and the integration limits. As it turns out, for $\theta_1$ we have $0 < \theta_1 < 2\rho - 2\tau$,
so what matters is the mutual location of $\theta_1$ and $\theta_s$. If $\theta_1 < \theta_s$, then by Lemma 8(ii) the main term is
determined by $\theta = \theta_s$. Since $\beta(R,\theta_s) = 0$, we obtain for $E_e(R,A,\tau)$ the bound

$$E_e(R,A,\tau) \ge \frac{A}{4}\bigl(1 - \cos(\theta_s + \tau)\bigr) - G(\theta_s,\tau),$$

which is (10). On the other hand, if $\theta_1 > \theta_s$, then we use part (i) of the same lemma and obtain the bound
(11). This proves the first two parts of the theorem except the endpoint $\theta_2$ of the range of angles in (11), which
will be established later.

2. $\phi_0 > \rho$. Now the asymptotic value of $F(\theta,\tau)$ is determined by $\phi = \rho$, and so

$$n^{-1}\ln F(\theta,\tau) \sim \frac12\ln\Bigl(1 - \frac{\tan^2(\theta/2+\tau)}{\tan^2\rho}\Bigr) - E_{sp}(\rho, A).$$

We proceed to computing the asymptotic expression for the outer integral in (17). Again taking $b(\theta) =
\beta(R,\theta)$ and denoting the integrand by $D$, we obtain

$$n^{-1}\ln D \sim R + \ln\sin\theta + \frac12\ln\Bigl(1 - \frac{\tan^2(\theta/2+\tau)}{\tan^2\rho}\Bigr) - E_{sp}(\rho, A).$$

Differentiating, we find that the maximum of this expression on $\theta$ is attained for $\theta = \theta(\rho)$, and it is possible
to prove that with our choice of $\rho$ the value $\theta(\rho)$ is always within the integration range: $\theta_s < \theta(\rho) < 2(\rho - \tau)$.
Furthermore, by (15) the first three terms in the expression for $D$ add up to zero. Concluding, in this case
the integral on $\theta$ evaluates asymptotically to

$$\int_{\theta_s}^{2\rho - 2\tau} e^{n\beta(R,\theta) + \ln F(\theta,\tau)}\,d\theta \cong e^{-nE_{sp}(\rho,A)}.$$

Clearly, the second term in (17) has the same asymptotic behavior, which is therefore the answer in the case
studied. It remains to find the value $\theta_2$ when the main term of the estimate moves from (11) to (12). This
obviously happens when the two functions first become equal as the angle $\theta_s$ decreases from $\theta_1$ or when the
rate $R$ reaches the value such that $\theta(\rho(R)) = \theta_1$, or when $\theta_s = \theta_2$. This concludes the proof of (12) and
thus of the theorem.

Remark 6: The results of [17] are obtained from this theorem by substituting $\tau = 0$ in (14). Denoting
$\theta(\rho)$ in this case by $\theta_E$, we find that $\cos\theta_E = \cos^2\rho$. Further, substituting $\theta_E$ into (15), we find that
$\rho(R) = \theta_s(R)$, i.e., the optimizing value of the decoding radius in (17) in this case is $\theta_s$. Taking $\tau = 0$ in
(16), we obtain for $\theta_1$ the explicit equation $\cos\theta_1 = (A/4)\sin^2\theta_1$ whence $\theta_1 = \theta_e$. Further, the equation
for $R^*$, which in general is $\theta(\rho(R)) = \theta_1$, now reduces to $\theta_E = \theta_1$ or

$$\csc^2\theta_E = \bigl[\sin^2\theta_s(R^*)\bigl(2 - \sin^2\theta_s(R^*)\bigr)\bigr]^{-1} = \frac12\Bigl(1 + \sqrt{1 + \frac{A^2}{4}}\Bigr). \tag{19}$$

From this we find $R^* = -\ln\sin\theta_c$, or $\theta_2 = \theta_c$ of (9b)-(9c). Hence $\theta_2$ equals the critical angle and $R^*$ equals
the critical rate of the channel.

These remarks also enable us to make some observations on typical error events in the course of decoding
of codes $C$. They are easier understood for $\tau = 0$. Since the codes $C$ generally are not distance invariant, the
following is valid on average only.

1. Suppose that $\pi/2 > \theta_s > \theta_e$, then the errors that contribute to the main term of $E(R,A)$ most
probably are of weight $\phi_0 \in (\theta_s/2, \theta_s)$. In the case of decoding error the typical distance of the
output code vector from the transmitted one equals $\theta_s$ and does not depend on the level of noise in the
channel.

2. For $\theta_e > \theta_s > \theta_c$, typical errors are also of weight $\phi_0$ and result in code vectors at distance $\theta_e$ from
the transmitted one. From the moment that $\theta_E = \theta_e$ or $\theta_c = \theta_s$, typical errors are of weight $\theta_s$ and
the resulting code vectors are at distance $\theta_E$ from the transmitted one. The error rate in this case does
not depend on the actual channel noise or on the distance of the code.

Remark 7: (The Elias angle). Note that the value $\theta_E$ gives the answer to the following geometric question.
Let $x \in C$ be a vector in a code of rate $R$ and distance $\theta_s$. Consider all the neighbors $x'$ of $x$ in $C$ such that
$d(x,x') = \alpha$ for a given $\alpha$ and draw the cones $\mathrm{Con}(x', \theta_s)$ about them. What is the minimum value of $\alpha$
such that the fraction of the surface of $\mathrm{Con}(x, \theta_s)$ covered by these cones asymptotically becomes one? The
answer follows from (15) and is given by $\alpha = \theta_E$. This parameter plays the same role for $S^{n-1}$ as the Elias
radius for the Hamming space (see, e.g., (Q), therefore we call it the Elias angle. This also hints, by the
same geometric argument as in the Hamming case, at the bound $\theta \le \theta_E$ for the maximal attainable minimum
distance of a spherical code of rate $R$. Solving the first equality in (19) for $R$, we obtain a different form
of this bound, namely $R \le -\ln(\sqrt{2}\sin(\theta/2))$. This is an old bound of Rankin [15] and Coxeter [4] on the
rate of a spherical code of distance $\theta$ whose proof we therefore obtain.

Remark 8: Note that though we emphasized codes $C$ in our derivation, many parts of it, such as bound
(17), apply to any sequence of codes with a known distance profile. They are also applicable to binary codes
used over the binary-input Gaussian channel. Let $C$ be a binary spherical code, i.e. a subset of $S^{n-1}(\sqrt{An})$
such that coordinates of every vector in $C$ take values $\pm\sqrt{A}$. Let $d(C)$ and $\theta(C)$ be the minimum Hamming
and angular distance in $C$ respectively, then $d(C) = n(1 - \cos\theta(C))/2$. We can specialize bound (17) to
this case as follows:

$$P(\mathcal{E}, C) \le \sum_{w=d(C)}^{\lfloor n(1-\cos 2\rho)/2 \rfloor} A_w F(\theta_w,\tau) + Q(\rho), \tag{20}$$

where $(A_d, \dots, A_n)$ is the distribution vector of Hamming distances in $C$ and $\theta_w = \arccos(1 - 2w/n)$. It
is straightforward to compute the trade-off bounds analogous to Theorem 5. For $\tau = 0$ they reduce to a
bound on the error rate of complete decoding for $C$ which can be used for finite length as well. For that
purpose, more accurate approximations on $F(\theta,\tau)$ than those used above are readily available. In particular,
the normalized area of the spherical cap can be computed with arbitrary precision from the asymptotic series
provided by the Laplace method [12], and a more precise expression for $Q(\phi)$ than the one quoted in Lemma
6 is given in [17, Eq. (51)]. Asymptotically (20) becomes the same as Poltyrev's "tangential-sphere" bound
[13]; for binary linear codes with binomial weight spectrum $A^0_w$ (see Sect. 2) we immediately recover the
part of the random coding exponent below the cutoff rate.
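The correspondence between Hamming and angular distance used in this remark can be checked directly. The sketch below maps binary words to vectors with coordinates $\pm\sqrt{A}$ and compares the angle between two such vectors with $\arccos(1-2w/n)$. The function names and the example words are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def to_sphere(bits: Sequence[int], A: float) -> List[float]:
    """Map a binary word to a point with coordinates +sqrt(A) or -sqrt(A)."""
    return [math.sqrt(A) * (1 - 2 * b) for b in bits]


def angle(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def theta_w(w: int, n: int) -> float:
    """Angular distance corresponding to Hamming distance w."""
    return math.acos(1 - 2 * w / n)


n, A, w = 16, 4.0, 5
u = to_sphere([0] * n, A)
v = to_sphere([1] * w + [0] * (n - w), A)     # Hamming distance w from u

print(sum(a * a for a in u), A * n)           # squared norm equals A*n
print(angle(u, v), theta_w(w, n))             # the two angles coincide
print(n * (1 - math.cos(angle(u, v))) / 2)    # recovers the Hamming distance w
```

Two words at Hamming distance $w$ have inner product $A(n-2w)$ and squared norm $An$, which gives $\cos\theta_w = 1 - 2w/n$ and hence the relation $d(C) = n(1-\cos\theta(C))/2$.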



5 Related results: Bounded distance decoding and error detection 

Let us address a related question, that of error exponents for bounded distance decoding of spherical codes.
Consider the following partial decoding mapping $\psi_\tau : X \to C$: if $y$ is within distance $\tau$ of a code vector $x$,
then $\psi_\tau(y) = x$, and if there is no such $x$, the value of $\psi_\tau(y)$ is undefined. Recall that by distance $d(x,y)$
we mean the angle $\angle(x,y)$. Again we are interested in the best attainable error exponent of such decoding
for spherical codes. The following proposition is obvious if in Lemma|9]we take p = n/2, and use Lemmas 
01 and© 
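The partial decoding mapping described above can be sketched in a few lines. This is only an illustration under stated assumptions: the code is given as an explicit list of vectors, the names `angle` and `bounded_distance_decode` are hypothetical, and "undefined" is modeled by returning `None`.

```python
import math

def angle(x, y):
    """Angular distance between two nonzero real vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / (norm_x * norm_y))))

def bounded_distance_decode(y, code, tau):
    """Return a code vector within angle tau of y, or None if there is none.

    If tau is less than half the minimum angular distance of the code,
    at most one code vector can qualify, so the scan order is irrelevant.
    """
    for x in code:
        if angle(x, y) <= tau:
            return x
    return None

# Hypothetical two-vector code on the unit circle.
code = [(1.0, 0.0), (0.0, 1.0)]
decoded = bounded_distance_decode((1.0, 0.1), code, 0.3)
undefined = bounded_distance_decode((1.0, 1.0), code, 0.3)
```

In this toy example the first received vector lies within angle 0.3 of `(1.0, 0.0)` and is decoded to it, while the second is at angle $\pi/4$ from both code vectors and is left undecoded.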

Proposition 12 Let C be a code in X with distance $\theta(C) > 0$ and distance profile $b(\theta)$. Then for any $\epsilon > 0$
the probability of decoding error

Pde(C)< I 'J ^ e n H + ^ ln ( 1 - i '^))dQ(0)^ + e-"^( 7r / 2 -^), (21) 
where $\theta(C) < \theta < \pi/2 - \tau - \epsilon$ and $\max(\theta(C)/2, \theta - \tau) < \phi < \theta + \tau$.
Note that the choice of the upper limit $\theta < \pi/2 - \tau - \epsilon$ is forced by Lemma 10.
Let us first study the asymptotic behavior of the integral on $\phi$. Letting $F(\theta, \tau) = \int e^{n g(\phi)}\, d\phi$ with

,2/7) „ 



we find the root $\phi_0$ of $g'(\phi) = 0$ to satisfy

$$\sin^2\phi_0 = \frac{4 + A\sin^2(2(\theta-\tau))}{4 + 2A + 2A\cos(2(\theta-\tau))}$$



(cf. (fTHl). It is easy to see that $\phi_0 < \pi/2$. By the calculation following ( fTHl we know that also in the present
situation $\phi_0 > \theta - \tau$, so the asymptotics of $F(\theta, \tau)$ depends on the mutual location of $\phi_0$ and $\theta + \tau$. Thus
we obtain for the error exponent $E_e(R, A, \tau) = -n^{-1} \ln P_{de}(C)$

4(^,At)> / max (-&(*) -#o)) 

where $\tilde\phi_0 = \phi_0$ if $\phi_0 < \theta + \tau$ or $\tilde\phi_0 = \theta + \tau$ otherwise. The first situation usually occurs for high code rates,
and the last for low rates. As above, the second term in (21) can improve the high-rate case.

We conclude this section by studying error detection with spherical codes. Generally, error detection
proceeds as follows: if the received vector y is contained in C, the decoder outputs y; otherwise its output is
undefined. Clearly, for any finite-size code $C \subset S^{n-1}$ the probability of undetected error is zero; therefore
we define error detection as a limiting case of bounded distance decoding and study the behavior of the error probability as
$\tau \to 0$. Since the code is a finite set, the cumulative measure of spherical caps about code vectors tends to
zero as their angle does. Hence the error probability $P_{de}(C)$ is determined by the rate of decrease of the area
of a spherical cap. Assume that C is a code with distance $\theta$ separated from 0 and distance profile $b(\theta)$. If
$\tau = 0$, then by fl3 we have

$$\sin^2\phi_0 - \sin^2\theta = \frac{4(1 - \sin^2\theta)}{4 + 2A + 2A\cos 2\theta} > 0;$$
so by continuity for small positive $\tau$ also $\sin^2\phi_0 > \sin^2(\theta + \tau)$. Hence we obtain

1 , / tan 2 



Since for $\tau \to 0$

$$\frac{\tan^2(\theta-\tau)}{\tan^2(\theta+\tau)} = 1 - \frac{8}{\sin 2\theta}\,\tau + O(\tau^2),$$

we conclude that the probability of undetected error essentially does not depend on the distance profile of C 
and behaves as 

$$P_{ue} \cong \exp\bigl(n \ln\sqrt{8\tau\csc 2\theta(C)}\bigr) = (8\tau\csc 2\theta(C))^{n/2}.$$

We see that basically one and the same behavior can be claimed for any code with minimum distance $\theta$
separated from 0; thus the asymptotic answer for the undetected error rate of spherical codes is known 
exactly (unlike the more difficult Hamming case where it essentially depends on optimal codes). 
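The first-order expansion of $\tan^2(\theta-\tau)/\tan^2(\theta+\tau)$ used above is easy to confirm numerically. The sketch below compares the exact ratio with $1 - 8\tau/\sin 2\theta$ for small $\tau$ and evaluates the resulting estimate $(8\tau\csc 2\theta)^{n/2}$; the function names and the sample values of $\theta$, $\tau$, n are assumptions made for illustration.

```python
import math

def tan_ratio(theta, tau):
    """Exact value of tan^2(theta - tau) / tan^2(theta + tau)."""
    return (math.tan(theta - tau) / math.tan(theta + tau)) ** 2

def first_order(theta, tau):
    """First-order expansion 1 - 8 tau / sin(2 theta) around tau = 0."""
    return 1.0 - 8.0 * tau / math.sin(2.0 * theta)

def undetected_error_estimate(theta, tau, n):
    """Asymptotic estimate (8 tau csc(2 theta))^(n/2); meaningful only for small tau."""
    return (8.0 * tau / math.sin(2.0 * theta)) ** (n / 2.0)

theta = math.pi / 3
# The gap between the exact ratio and its expansion should shrink like tau^2.
gap_coarse = abs(tan_ratio(theta, 1e-3) - first_order(theta, 1e-3))
gap_fine = abs(tan_ratio(theta, 1e-4) - first_order(theta, 1e-4))
```

Reducing $\tau$ by a factor of 10 reduces the gap by roughly a factor of 100, as expected of an $O(\tau^2)$ remainder.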

Acknowledgment. Thanks to an anonymous referee for pointing out a potential error in the original 
derivation. 
A PROOF OF THEOREM 2. Let C be a binary linear code of rate R, distance d and weight distribution
$A_i(C)$, where $A_i(C)$ = A%. The weight profile of C has the form $\alpha_0(\omega) := h(\omega) - h(\delta_{GV})$. Let $F_+$ denote
the undetected error event and $F_-$ the error-or-erasure event. Assume w.l.o.g. that the transmitted vector is
all-zero and that e is the channel error vector. The probability of the error events can be bounded above as
follows:

$$P(F_\pm) \le P(F_\pm \mid e \in S_{r_\pm}(0)) + P(e \notin S_{r_\pm}(0))$$
for some positive $r_+$ and $r_-$. Below we choose $r_\pm = d \pm 2t$. More concretely, we have

$$P(F_\pm \mid e \in S_{r_\pm}(0)) \le \sum_{w=d}^{2(d\pm t)} A_w(C) \sum_{e=w/2\pm t}^{d\pm 2t} p^e(1-p)^{n-e} \sum_{s=0}^{e\mp 2t} p^{w}_{e,s}. \qquad (22)$$

More accurately, the range of the summation index e in the above expression is $e \ge \lceil w/2 \rceil \pm t$ if w is odd
and $e \ge w/2 + 1 \pm t$ if w is even; we will ignore this. Let us proceed with the undetected error case and rewrite
the estimate in an explicit form, substituting the value of $A_w$:

$$P(F_+) \le 2^{Rn-n} \sum_{w=d}^{2d+2t} \binom{n}{w} \sum_{e=w/2+t}^{d+2t} p^e(1-p)^{n-e} \sum_{i=w/2+t}^{e} \binom{w}{i}\binom{n-w}{e-i}$$

$$+ \sum_{e=d+2t+1}^{n} \binom{n}{e} p^e(1-p)^{n-e}. \qquad (23)$$

To facilitate the transition to this expression from (22), notice that if c is the incorrect codeword of weight $w > 0$
output by the decoder and e is the error vector, then the index $i = |\operatorname{supp}(e) \cap \operatorname{supp}(c)|$.
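The counting behind the inner sum on i can be verified by brute force for small parameters. The sketch below fixes a codeword c of weight w, enumerates all error vectors of weight e, and groups them by the overlap $i = |\operatorname{supp}(e) \cap \operatorname{supp}(c)|$; it also checks that the Hamming distance between e and c equals $e + w - 2i$, so that a distance of at most $e - 2t$ corresponds to $i \ge w/2 + t$. The function name and the parameters n = 10, w = 4, e = 5 are illustrative assumptions.

```python
from itertools import combinations
from math import comb

def overlap_counts(n, w, e):
    """Count weight-e binary vectors of length n by the size i of their
    common support with a fixed weight-w vector c (support {0, ..., w-1})."""
    support_c = set(range(w))
    counts = {}
    for support_e in combinations(range(n), e):
        i = len(support_c.intersection(support_e))
        # Hamming distance between the two vectors is |supp(e) xor supp(c)|.
        dist = len(support_c.symmetric_difference(support_e))
        assert dist == e + w - 2 * i
        counts[i] = counts.get(i, 0) + 1
    return counts

n, w, e = 10, 4, 5
counts = overlap_counts(n, w, e)
# Closed form: choose i positions inside supp(c) and e - i outside it.
closed_form = {i: comb(w, i) * comb(n - w, e - i) for i in counts}
```

For these parameters the enumerated counts coincide with $\binom{w}{i}\binom{n-w}{e-i}$ for every overlap i that occurs.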

The product $\binom{w}{i}\binom{n-w}{e-i}$ is maximized for

$$i \approx \frac{ew}{n} \le \frac{(d+2t)w}{n} < \frac{w}{2} + t,$$

where the last step follows (for large n) by the assumption of the theorem $R > 1 - h(1/2 - \tau)$, which
translates into $\delta_{GV}(R) + \tau < 1/2$. Therefore the sum on w in (23) for large n can be estimated from above
by



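$$\sum_{e=d/2+t}^{d+2t} p^e(1-p)^{n-e} \sum_{w=d}^{2(e-t)} \binom{n}{w}\binom{w}{w/2+t}\binom{n-w}{e-w/2-t} \qquad (24)$$

$$= \sum_{e=d/2+t}^{d+2t} \binom{n}{e} p^e(1-p)^{n-e} \sum_{w=d}^{2(e-t)} \binom{e}{w/2+t}\binom{n-e}{w/2-t} \qquad (25)$$

(since $\binom{n}{w} p^{w}_{e-2t,e} = \binom{n}{e} p^{e}_{e-2t,w}$). In the sum on w we are counting the number of vectors of weight w which
are at distance $e - 2t$ from a given vector of weight e. This number is maximized when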

$$\frac{e - t - w/2}{e} = \frac{e - 2t}{n}.$$

Introducing the notation $w = \omega n$, $e = \rho n$, we can rewrite this relation as

$$\omega^* = 2\rho(1 - \rho) - 2\tau(1 - 2\rho).$$
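The location of this maximum can be checked against an exact search. The sketch below counts, for each even w, the vectors of weight w at distance $e - 2t$ from a fixed vector of weight e (common support of size $w/2 + t$, plus $w/2 - t$ positions outside it) and compares the maximizing w with $n\omega^*$. The function names and the parameters n = 1000, e = 300, t = 20 are assumptions chosen for illustration.

```python
from math import comb

def count_vectors(n, e, t, w):
    """Number of weight-w vectors at distance e - 2t from a fixed weight-e
    vector, for even w: overlap w/2 + t inside its support, w/2 - t outside."""
    return comb(e, w // 2 + t) * comb(n - e, w // 2 - t)

def best_weight(n, e, t):
    """Even weight w in [2t, 2(e - t)] that maximizes count_vectors."""
    candidates = range(2 * t, 2 * (e - t) + 1, 2)
    return max(candidates, key=lambda w: count_vectors(n, e, t, w))

def predicted_weight(n, e, t):
    """n * omega*, with omega* = 2 rho (1 - rho) - 2 tau (1 - 2 rho)."""
    rho, tau = e / n, t / n
    return n * (2 * rho * (1 - rho) - 2 * tau * (1 - 2 * rho))

n, e, t = 1000, 300, 20
w_exact = best_weight(n, e, t)
w_formula = predicted_weight(n, e, t)
```

For these values the exact maximizer and the prediction $n\omega^*$ agree up to rounding.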
Thus the expression in (25) is $\cong$-equivalent to

$$\sum_{e=d/2+t}^{d+2t} \binom{n}{e} p^e(1-p)^{n-e} \binom{e}{\rho(n-e)+2t\rho}\binom{n-e}{(\rho-2\tau)(n-e)}$$

$$\cong \sum_{e=d/2+t}^{d+2t} \binom{n}{e}\binom{n}{e-2t} p^e(1-p)^{n-e}$$

$$\cong \max_{\delta_{GV}/2+\tau \le \rho \le \delta_{GV}+2\tau} \exp[-n(D(\rho\|p) - h(\rho-2\tau))]. \qquad (26)$$
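The exponential equivalence between a single term $\binom{n}{e}\binom{n}{e-2t}p^e(1-p)^{n-e}$ and $\exp[-n(D(\rho\|p) - h(\rho-2\tau))]$ can be illustrated at finite length. The sketch below assumes natural logarithms for both h and D (so that the expression matches exp), uses hypothetical helper names, and picks n = 20000, e = 4000, t = 400, p = 0.1 purely as an example.

```python
from math import lgamma, log

def h(x):
    """Binary entropy function in nats (assumption: natural logarithms)."""
    return -x * log(x) - (1 - x) * log(1 - x)

def D(x, p):
    """Information divergence D(x || p) between Bernoulli(x) and Bernoulli(p), in nats."""
    return x * log(x / p) + (1 - x) * log((1 - x) / (1 - p))

def log_binom(n, k):
    """Natural logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def finite_length_rate(n, e, t, p):
    """(1/n) ln [ C(n, e) C(n, e - 2t) p^e (1 - p)^(n - e) ]."""
    return (log_binom(n, e) + log_binom(n, e - 2 * t)
            + e * log(p) + (n - e) * log(1 - p)) / n

def exponent(rho, tau, p):
    """The exponent D(rho || p) - h(rho - 2 tau) appearing in (26)."""
    return D(rho, p) - h(rho - 2 * tau)

n, e, t, p = 20000, 4000, 400, 0.1
gap = abs(finite_length_rate(n, e, t, p) + exponent(e / n, t / n, p))
```

The remaining gap is of order $(\ln n)/n$, which is the polynomial factor ignored by the $\cong$ notation.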

The last exponent is maximized for $\rho = \rho_0^+$, and thus the unrestricted maximum on $\omega$ is attained for $\omega = \omega_0$.
The cases (a)-(c) of the theorem are realized depending on how these values are located with respect to the
optimization limits

$$\omega \ge \delta, \qquad \delta_{GV}/2 + \tau \le \rho \le \delta_{GV} + 2\tau.$$

If both $\rho_0^+$ and $\omega_0$ satisfy these inequalities, we substitute them into (26), recall the factor $2^{Rn-n}$ from (23)
and arrive at case (b) of the bound M + in ©. 

If $\rho_0^+ > \delta_{GV} + 2\tau$ then we substitute $\rho = \delta_{GV} + 2\tau$, $\omega = \omega^*$ and obtain the expression
$$D(\delta_{GV} + 2\tau \| p) - h(\delta_{GV}) + (1 - R) = D(\delta_{GV} + 2\tau \| p),$$
i.e., case (c). Finally, if $\omega_0 < \delta$, we substitute $w = d$ in (24) and obtain

2— e ^-rt-Q^y n - rf 



e=d/2+i 



d/2 + t) \e-d/2-t 



^ I max I ^ Ir/Yl - o)™" 
d/2 + V a>d/2 Va - d/2y F v F > 



The last maximum is attained for $a - d/2 \approx (n - d)p$. Substituting and switching to exponents, we arrive
at the case (a) in (0. 



It remains to show that in this case the first of the two terms in (23) provides the dominating
exponent; this is a straightforward calculation which we shall omit. This completes the analysis of the
undetected error event $F_+$.



Let us sketch the proof in the error-and-erasure case $F_-$. Now the sum (22) can be written as

$$2^{Rn-n} \sum_{w=d}^{2d-2t} \binom{n}{w} \sum_{e=w/2-t}^{d-2t} p^e(1-p)^{n-e} \sum_{i=w/2-t}^{e} \binom{w}{i}\binom{n-w}{e-i}.$$

We would like to prove that the maximum on i, which is again attained for $i \approx ew/n$, at least for large n
falls below $w/2 - t$. This will follow from the inequality

$$\omega\delta - \frac{\omega}{2} < (2\omega - 1)\tau,$$

which is proved as follows. We can assume that $\omega < 1/2$. By assumption, $t < pn/2$ and hence $t < d/2$
since $\delta_{GV}(R) > p$ for $R < C$. Then

$$\omega\delta - \frac{\omega}{2} + (1 - 2\omega)\tau < \frac{1}{2}(\delta - \omega) \le 0.$$
Hence for any $\rho \le \delta - 2\tau$ we have

$$\omega\rho \le \omega(\delta - 2\tau) < \frac{\omega}{2} - \tau$$

as desired. So instead of (24) we obtain the expression

$$\sum_{e=d/2-t}^{d-2t} p^e(1-p)^{n-e} \sum_{w=d}^{2(e+t)} \binom{n}{w}\binom{w}{w/2-t}\binom{n-w}{e-w/2+t}.$$



The remaining part of the analysis of this case proceeds as above except that t is replaced by $-t$ throughout.
In particular, $\omega^* = 2\rho(1 - \rho) + 2\tau(1 - 2\rho)$, the optimum on $\rho$ is attained for $\rho_0^-$, and so on. ∎